AITopics | total number

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Technical Lemmas

Neural Information Processing Systems, Apr-30-2026, 03:55:35 GMT

The proof is by induction on $k$. Consider the general case $p_{2k+1}$. It is easy to see that $g'(x) = e^x - p_{2k}(x)$ and $g''(x) = e^x - p_{2k-1}(x)$. By the induction hypothesis, $g'' \ge 0$ and therefore $g$ is convex. Thus, the minimum of $g$ is attained at its stationary points. It is easy to observe that $x = 0$ is indeed a stationary point. Thus, $\min_{x \in \mathbb{R}} g(x) = g(0) = 0$, which finishes the proof.
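The convexity argument in the excerpt above can be checked numerically. The sketch below rests on two assumptions the excerpt does not state: that p_n denotes the degree-n Taylor polynomial of e^x at 0, and that g(x) = e^x - p_{2k+1}(x). Both are consistent with the derivatives quoted in the proof, but they are a reading of it, not the authors' definitions. The helper names `taylor_exp`, `g`, and `min_on_grid` are hypothetical.

```python
import math


def taylor_exp(n, x):
    """Degree-n Taylor polynomial of exp at 0 (assumed meaning of p_n)."""
    return sum(x**i / math.factorial(i) for i in range(n + 1))


def g(k, x):
    """Assumed definition: g(x) = exp(x) - p_{2k+1}(x)."""
    return math.exp(x) - taylor_exp(2 * k + 1, x)


def min_on_grid(k, lo=-5.0, hi=5.0, steps=2001):
    """Return (smallest value of g, grid point where it occurs) on [lo, hi]."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min((g(k, x), x) for x in xs)


if __name__ == "__main__":
    for k in range(4):
        value, x = min_on_grid(k)
        # Up to floating-point rounding, the minimum is 0 and sits near x = 0.
        print(k, value, x)
```

A grid search is only a sanity check, not a proof: it illustrates that g stays non-negative and touches zero at the stationary point x = 0, as the induction step claims.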

artificial intelligence, hei, probability, (16 more...)

Technology: Information Technology > Artificial Intelligence (0.46)

e82ef7865f29b40640f486bbbe7959a7-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-30-2026, 03:40:23 GMT

data mining, machine learning, nullk null, (20 more...)

Country:

Europe > France (0.28)
North America > Canada (0.28)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

ROI Maximization in Stochastic Online Decision-Making: Supplementary Material, A Decision-Making Policies

Neural Information Processing Systems, Apr-25-2026, 19:25:45 GMT

In this section, we give a formal functional definition of the decision-making policies introduced in Section 3. During each task, the agent sequentially observes samples $x_i \in [-1,1]$ representing realizations of stochastic observations of the current innovation value. A map $\tau\colon [-1,1]^{\mathbb{N}} \to \mathbb{N}$ is a duration (of a decision task) if for all $x \in [-1,1]^{\mathbb{N}}$, its value $d = \tau(x) \in \mathbb{N}$ at $x$ depends only on the first $d$ components $x_1, x_2, \dots, x_d$ of $x = (x_1, x_2, \dots)$; mathematically speaking, if $X$ is a discrete stochastic process (i.e., a random sequence), then $\tau(X)$ is a stopping time with respect to the filtration generated by $X$. This definition reflects the fact that the components $x_1, x_2, \dots$ of the sequence $x = (x_1, x_2, \dots)$ are generated sequentially, and the decision to stop testing an innovation depends only on what occurred so far. A concrete example of a duration function is the one, mentioned in the introduction and formalized in (4), that keeps drawing samples until the empirical average of the observed values $x_i$ surpasses/falls below a certain threshold, or a maximum number of samples have been drawn.
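The threshold-based duration described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's formalization in (4): the function name `duration`, the fixed `upper` and `lower` thresholds, and the strict comparisons are all hypothetical choices. The property the sketch is meant to show is that the returned value d is determined by the first d samples alone.

```python
from typing import Iterable


def duration(xs: Iterable[float], upper: float, lower: float,
             max_samples: int) -> int:
    """Number of samples drawn before stopping (hypothetical rule).

    Stops as soon as the running mean exceeds `upper`, drops below
    `lower`, or `max_samples` samples have been consumed.
    """
    total = 0.0
    drawn = 0
    for x in xs:
        drawn += 1
        total += x
        mean = total / drawn
        if mean > upper or mean < lower or drawn >= max_samples:
            return drawn
    raise ValueError("sequence ended before the stopping rule fired")


if __name__ == "__main__":
    # The running mean after two samples is 0.7 > 0.6, so the rule stops there.
    print(duration([0.5, 0.9, -1.0, -1.0], upper=0.6, lower=-0.6, max_samples=10))
    # Later components do not matter once the rule has fired.
    print(duration([0.5, 0.9, 1.0, 1.0], upper=0.6, lower=-0.6, max_samples=10))
```

Because the loop only ever inspects samples it has already consumed, changing any component after position d cannot change the result, which is the stopping-time property in the definition.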

artificial intelligence, decision-making supplementary material adecision-making policy, nex, (14 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.34)

NeurIPS2021_ImperfectCommmunicationBandits

Madhu

Neural Information Processing Systems, Apr-25-2026, 14:38:06 GMT

We consider the case where each message fails with probability $1 - p$ and each agent $i$ uses the messages it receives from its neighbors with probability $p_i$. This is equivalent to each agent $i$ receiving messages from its neighbors with probability $p_i p$. Let $\mathbb{1}\{(i,j) \in E_t\}$ be the indicator random variable that takes value 1 if agent $i$ receives the reward value and arm id from agent $j$ at time $t$, and 0 otherwise. We start by proving some useful lemmas. Lemma 1. (Restatement of results from [3]) Let k = Thus we have P Ai(t+1) = k,Nik(t) > k P bµi1(t) µ1 Ci1(t) +P bµik(t) µk +Cik(t) This concludes the proof of Lemma 1. Lemma 2. Let (G) be the clique covering number of graph G. Let k = Let C be a non-overlapping clique covering of G. Then we have that k |C| < Nik( ik,C) k. From regret results it follows that regret for this case is greater than the regret for the case where ik,C < k,C for some (or all) i. We analyse the expected number of times agents pull suboptimal arm k as follows, X P bµi1(t) µ1 Ci1(t) +P bµik(t) µk +Cik(t), (29) where (a) follows from the fact that the clique covering is non-overlapping. This concludes the proof of Lemma 2. Lemma 3. Let $d_i(G)$ be the degree of agent $i$ in graph $G$.
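The equivalence stated in the first two sentences above (a message that survives with probability p and is then used with probability p_i is effectively received with probability p_i p) can be illustrated with a small simulation. The sketch assumes the two events are independent, which the excerpt implies but does not spell out; the function name `delivery_rate` and the parameter values are hypothetical.

```python
import random


def delivery_rate(p, p_i, trials, seed=0):
    """Fraction of messages that both survive the channel and get used.

    Assumes (for illustration) that channel failure and the agent's
    decision to use a message are independent Bernoulli events.
    """
    rng = random.Random(seed)
    used = 0
    for _ in range(trials):
        delivered = rng.random() < p    # message is not dropped
        accepted = rng.random() < p_i   # agent i chooses to use it
        if delivered and accepted:
            used += 1
    return used / trials


if __name__ == "__main__":
    # The empirical rate should be close to p * p_i = 0.3.
    print(delivery_rate(p=0.6, p_i=0.5, trials=100_000))
```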

agent, artificial intelligence, nik, (16 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

3556a3018cce3076e27dbbf9645b44d5-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 11:00:54 GMT

artificial intelligence, experiment, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

035f23c0ac4cf2b73b9365ba5a98ad56-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 08:16:02 GMT

artificial intelligence, machine learning, normalization layer, (17 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Noise Immunity in In-Context Tabular Learning: An Empirical Robustness Analysis of TabPFN's Attention Mechanisms

Hu, James, Ghelichi, Mahdi

arXiv.org Machine Learning, Apr-9-2026

Tabular foundation models (TFMs) such as TabPFN (Tabular Prior-Data Fitted Network) are designed to generalize across heterogeneous tabular datasets through in-context learning (ICL). They perform prediction in a single forward pass conditioned on labeled examples without dataset-specific parameter updates. This paradigm is particularly attractive in industrial domains (e.g., finance and healthcare) where tabular prediction is pervasive. Retraining a bespoke model for each new table can be costly or infeasible in these settings, while data quality issues such as irrelevant predictors, correlated feature groups, and label noise are common. In this paper, we provide strong empirical evidence that TabPFN is highly robust under these sub-optimal conditions. We study TabPFN and its attention mechanisms for binary classification problems with controlled synthetic perturbations that vary: (i) dataset width by injecting random uncorrelated features and by introducing nonlinearly correlated features, (ii) dataset size by increasing the number of training rows, and (iii) label quality by increasing the fraction of mislabeled targets. Beyond predictive performance, we analyze internal signals including attention concentration and attention-based feature ranking metrics. Across these parametric tests, TabPFN is remarkably resilient: ROC-AUC remains high, attention stays structured and sharp, and informative features are highly ranked by attention-based metrics. Qualitative visualizations with attention heatmaps, feature-token embeddings, and SHAP plots further support a consistent pattern across layers in which TabPFN increasingly concentrates on useful features while separating their signals from noise. Together, these findings suggest that TabPFN is a robust TFM capable of maintaining both predictive performance and coherent internal behavior under various scenarios of data imperfections.
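Two of the controlled perturbations listed in the abstract above, injecting uncorrelated features and mislabeling a fraction of targets, might be generated along the following lines. This is a standard-library sketch and not the authors' code: the function names `add_noise_features` and `flip_labels` are hypothetical, standard Gaussian noise is an assumed choice for the "random uncorrelated features", and no TabPFN call is made.

```python
import random


def add_noise_features(rows, n_noise, rng):
    """Append n_noise independent Gaussian columns to every row.

    Gaussian noise is an assumption; the abstract only says the injected
    features are random and uncorrelated.
    """
    return [row + [rng.gauss(0.0, 1.0) for _ in range(n_noise)] for row in rows]


def flip_labels(labels, fraction, rng):
    """Flip a given fraction of binary (0/1) labels, chosen at random."""
    n_flip = round(fraction * len(labels))
    flipped = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        flipped[i] = 1 - flipped[i]
    return flipped


if __name__ == "__main__":
    rng = random.Random(0)
    rows = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
    labels = [0, 1, 0, 1]
    wide_rows = add_noise_features(rows, n_noise=3, rng=rng)
    noisy_labels = flip_labels(labels, fraction=0.25, rng=rng)
    print(len(wide_rows[0]), noisy_labels)
```

Sweeping `n_noise` and `fraction` over a grid and refitting a classifier on each perturbed dataset would reproduce the general shape of such a robustness study, though the exact protocol is the paper's to define.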

artificial intelligence, machine learning, natural language, (18 more...)

2604.04868

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Instance-optimal stochastic convex optimization: Can we improve upon sample-average and robust stochastic approximation?

Jiang, Liwei, Pananjady, Ashwin

arXiv.org Machine Learning, Mar-27-2026

We study the unconstrained minimization of a smooth and strongly convex population loss function under a stochastic oracle that introduces both additive and multiplicative noise; this is a canonical and widely-studied setting that arises across operations research, signal processing, and machine learning. We begin by showing that standard approaches such as sample average approximation and robust (or averaged) stochastic approximation can lead to suboptimal -- and in some cases arbitrarily poor -- performance with realistic finite sample sizes. In contrast, we demonstrate that a carefully designed variance reduction strategy, which we term VISOR for short, can significantly outperform these approaches while using the same sample size. Our upper bounds are complemented by finite-sample, information-theoretic local minimax lower bounds, which highlight fundamental, instance-dependent factors that govern the performance of any estimator. Taken together, these results demonstrate that an accelerated variant of VISOR is instance-optimal, achieving the best possible sample complexity up to logarithmic factors while also attaining optimal oracle complexity. We apply our theory to generalized linear models and improve upon classical results. In particular, we obtain the best-known non-asymptotic, instance-dependent generalization error bounds for stochastic methods, even in linear regression.
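To make the setting in the abstract above concrete, the sketch below shows a toy one-dimensional stochastic oracle with both multiplicative and additive noise, driven by averaged stochastic approximation (one of the baselines the abstract names). It is not VISOR, which the abstract does not describe in enough detail to implement. The quadratic loss, the noise distributions, the constant step size, and every function and parameter name are assumptions made for illustration.

```python
import random


def noisy_gradient(x, x_star, curvature, rng, mult_scale, add_scale):
    """Toy oracle for the loss 0.5 * curvature * (x - x_star)**2.

    The curvature is perturbed (multiplicative noise) and a Gaussian term
    is added (additive noise). The oracle knows x_star only because this
    is a synthetic example.
    """
    xi = rng.uniform(-mult_scale, mult_scale)
    eps = rng.gauss(0.0, add_scale)
    return (curvature + xi) * (x - x_star) + eps


def averaged_sa(x0, x_star, steps, step_size, seed=0,
                curvature=1.0, mult_scale=0.5, add_scale=1.0):
    """Constant-step stochastic approximation with iterate averaging."""
    rng = random.Random(seed)
    x = x0
    running_sum = 0.0
    for _ in range(steps):
        x -= step_size * noisy_gradient(x, x_star, curvature, rng,
                                        mult_scale, add_scale)
        running_sum += x
    return running_sum / steps


if __name__ == "__main__":
    # The averaged iterate should land near the minimizer x_star = 2.0.
    print(averaged_sa(x0=5.0, x_star=2.0, steps=5000, step_size=0.1))
```

Note how the multiplicative term scales with the distance from the minimizer while the additive term does not; instance-dependent bounds of the kind the abstract mentions are sensitive to exactly this distinction.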

algorithm, artificial intelligence, machine learning, (16 more...)

2603.25657

Country: